AITopics | skill discovery algorithm

Do's and Don'ts: Learning Desirable Skills with Instruction Videos

Neural Information Processing Systems | Mar-20-2026, 16:04:35 GMT

Unsupervised skill discovery is a learning paradigm that aims to acquire diverse behaviors without explicit rewards. However, it faces challenges in learning complex behaviors and often leads to learning unsafe or undesirable behaviors. For instance, in various continuous control tasks, current unsupervised skill discovery methods succeed in learning basic locomotions like standing but struggle with learning more complex movements such as walking and running. Moreover, they may acquire unsafe behaviors like tripping and rolling or navigate to undesirable locations such as pitfalls or hazardous areas.

artificial intelligence, machine learning, proceedings, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.39)

55576bcdf386ba73859fb71766f85758-Paper-Conference.pdf

Neural Information Processing Systems | Feb-13-2026, 14:54:56 GMT

Unsupervised skill discovery is a learning paradigm that aims to acquire diverse behaviors without explicit rewards.

artificial intelligence, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Jordan (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

55576bcdf386ba73859fb71766f85758-Paper-Conference.pdf

Neural Information Processing Systems | Oct-10-2025, 02:55:39 GMT

arxiv preprint arxiv, instruction network, video, (12 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Focused Skill Discovery: Learning to Control Specific State Variables while Minimizing Side Effects

Carr, Jonathan Colaço, Sun, Qinyi, Allen, Cameron

arXiv.org Artificial Intelligence | Oct-7-2025

Skills are essential for unlocking higher levels of problem solving. A common approach to discovering these skills is to learn ones that reliably reach different states, thus empowering the agent to control its environment. However, existing skill discovery algorithms often overlook the natural state variables present in many reinforcement learning problems, meaning that the discovered skills lack control of specific state variables. This can significantly hamper exploration efficiency, make skills more challenging to learn with, and lead to negative side effects in downstream tasks when the goal is under-specified. We introduce a general method that enables these skill discovery algorithms to learn focused skills -- skills that target and control specific state variables. Our approach improves state space coverage by a factor of three, unlocks new learning capabilities, and automatically avoids negative side effects in downstream tasks.

artificial intelligence, machine learning, reinforcement learning, (12 more...)

arXiv.org Artificial Intelligence

2510.04901

Country: North America > United States (0.46)

Genre: Research Report (0.64)

Industry:

Leisure & Entertainment (0.68)
Education > Focused Education (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.50)

Do's and Don'ts: Learning Desirable Skills with Instruction Videos

Neural Information Processing Systems | May-27-2025, 01:34:16 GMT

Unsupervised skill discovery is a learning paradigm that aims to acquire diverse behaviors without explicit rewards. However, it faces challenges in learning complex behaviors and often leads to learning unsafe or undesirable behaviors. For instance, in various continuous control tasks, current unsupervised skill discovery methods succeed in learning basic locomotions like standing but struggle with learning more complex movements such as walking and running. Moreover, they may acquire unsafe behaviors like tripping and rolling or navigate to undesirable locations such as pitfalls or hazardous areas. In response, we present DoDont (Do's and Don'ts), an instruction-based skill discovery algorithm composed of two stages.

artificial intelligence, machine learning, skill discovery algorithm, (4 more...)

Neural Information Processing Systems

Industry: Education > Educational Technology > Audio & Video (0.49)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.42)

Do's and Don'ts: Learning Desirable Skills with Instruction Videos

Kim, Hyunseung, Lee, Byungkun, Lee, Hojoon, Hwang, Dongyoon, Kim, Donghu, Choo, Jaegul

arXiv.org Artificial Intelligence | Jun-1-2024

Unsupervised skill discovery is a learning paradigm that aims to acquire diverse behaviors without explicit rewards. However, it faces challenges in learning complex behaviors and often leads to learning unsafe or undesirable behaviors. For instance, in various continuous control tasks, current unsupervised skill discovery methods succeed in learning basic locomotions like standing but struggle with learning more complex movements such as walking and running. Moreover, they may acquire unsafe behaviors like tripping and rolling or navigate to undesirable locations such as pitfalls or hazardous areas. In response, we present DoDont (Do's and Don'ts), an instruction-based skill discovery algorithm composed of two stages. First, in an instruction learning stage, DoDont leverages action-free instruction videos to train an instruction network to distinguish desirable transitions from undesirable ones. Then, in the skill learning stage, the instruction network adjusts the reward function of the skill discovery algorithm to weight the desired behaviors. Specifically, we integrate the instruction network into a distance-maximizing skill discovery algorithm, where the instruction network serves as the distance function. Empirically, with fewer than 8 instruction videos, DoDont effectively learns desirable behaviors and avoids undesirable ones across complex continuous control tasks. Code and videos are available at https://mynsng.github.io/dodont/
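
The two-stage idea in the abstract above can be illustrated with a small toy sketch. It is not the authors' implementation: `instruction_score` is a hypothetical hand-written stand-in for a trained instruction network, `base_skill_reward` is a simplified displacement-along-skill reward with an identity state representation, and multiplying the two is only one simple reading of "weighting the desired behaviors". The paper's actual formulation, in which the instruction network serves as the distance function, is not reproduced here.

```python
import math
from typing import Sequence


def instruction_score(state: Sequence[float], next_state: Sequence[float]) -> float:
    """Hypothetical stand-in for a trained instruction network.

    Returns a value in (0, 1); higher means the transition looks more
    desirable. Toy assumption: moving forward along the first coordinate
    is the "do" behavior, moving backward is the "don't" behavior.
    """
    dx = next_state[0] - state[0]
    return 1.0 / (1.0 + math.exp(-4.0 * dx))


def base_skill_reward(
    state: Sequence[float], next_state: Sequence[float], skill: Sequence[float]
) -> float:
    """Simplified skill reward: state displacement projected onto the skill vector.

    Assumes an identity state representation; real methods learn one.
    """
    return sum((n - s) * z for s, n, z in zip(state, next_state, skill))


def weighted_reward(
    state: Sequence[float], next_state: Sequence[float], skill: Sequence[float]
) -> float:
    """Scale the skill reward by the instruction score (an illustrative choice)."""
    return instruction_score(state, next_state) * base_skill_reward(
        state, next_state, skill
    )


# A forward step and a backward step, each perfectly aligned with its skill.
forward = weighted_reward((0.0, 0.0), (1.0, 0.0), (1.0, 0.0))
backward = weighted_reward((0.0, 0.0), (-1.0, 0.0), (-1.0, 0.0))
```

Both steps earn the same base reward because each follows its own skill vector exactly, but the toy instruction score shrinks the reward for the backward step, so a policy trained on the weighted reward would be pushed toward the desirable transition.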

arxiv preprint arxiv, instruction network, video, (11 more...)

arXiv.org Artificial Intelligence

2406.00324

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Industry: Education > Educational Technology > Audio & Video (0.81)

Technology: